| Dataset | Year | Programming Language | Data Source | Download Link |
|---|---|---|---|---|
| BigCloneBench | 2014 | Java | GitHub | Download |
| OJ dataset | 2016 | C++ | OJ Platform | Download |
| CodeSearchNet | 2019 | Go, Java, JavaScript, PHP, Python, Ruby | GitHub | Download |
| Code2Seq | 2019 | Java | GitHub | Download |
| Devign | 2019 | C | GitHub | Download |
| Google Code Jam (GCJ) | 2020 | C++, Java | OJ Platform | Download |
| CodeXGLUE | 2021 | Go, Java, JavaScript, PHP, Python, Ruby | GitHub | Download |
| CodeQA | 2021 | Java, Python | GitHub | Download |
| APPS | 2021 | Python | OJ Platform | Download |
| Shellcode_IA32 | 2021 | Assembly | OJ Platform | Download |
| SecurityEval | 2022 | Python | GitHub | Download |
| LLMSecEval | 2023 | Python, C | GitHub | Download |
| PoisonPy | 2023 | Python | GitHub | not yet published |

| Attack Technique | Year | Venue | Attack Type | Target Models | Target Tasks |
|---|---|---|---|---|---|
| Quiring et al. | 2019 | USENIX Security | Black-box Attack | Random Forest, LSTM | Authorship Attribution |
| DAMP | 2020 | OOPSLA | White-box Attack | Code2Vec, GGNN | Method Name Prediction, Variable Name Prediction |
| STRATA | 2020 | arXiv | Black-box Attack | Code2Seq | Method Name Prediction |
| MHM | 2020 | AAAI | Black-box Attack | BiLSTM, ASTNN | Function Classification |
| Srikant et al. | 2021 | ICLR | White-box Attack | Seq2Seq | Method Name Prediction |